Readability Consideration in Speech Synthesis Recording Script Selection
Abstract
Designing text scripts that cover enough phonetic units and prosodic phenomena is very important when recording a speech database for corpus-based speech synthesis. When designing recording scripts for speech synthesis databases, much of the effort typically goes into achieving maximal coverage of phonetic units with minimal speech recording. However, selecting sentences for optimal coverage of speech phenomena often picks sentences with difficult words or incorrect grammar, which speakers find hard to read both correctly and naturally. To address this problem in building speech databases, this paper proposes a selection process that creates easy-to-read text scripts for recording. We consider how to build a candidate set that is easy to read, so that the speaker can utter it as naturally as possible. We compute statistics of English text by analyzing the English Gigaword corpus and filter out sentences containing infrequent words and bigrams. The experiment shows that the selected scripts have good unit coverage of the language and good readability.
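The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the whitespace tokenizer, the function names (`tokenize`, `count_ngrams`, `is_easy_to_read`, `filter_candidates`) and the thresholds `min_word_count` and `min_bigram_count` are all assumptions made for illustration, and a small in-memory list of sentences stands in for the English Gigaword corpus.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def tokenize(sentence: str) -> List[str]:
    """Lowercase whitespace tokenization; strips surrounding punctuation (a simplification)."""
    tokens = (tok.strip(".,;:!?\"'()") for tok in sentence.lower().split())
    return [tok for tok in tokens if tok]


def count_ngrams(corpus: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count word and adjacent-word-pair (bigram) frequencies over a reference corpus."""
    words: Counter = Counter()
    bigrams: Counter = Counter()
    for sentence in corpus:
        toks = tokenize(sentence)
        words.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return words, bigrams


def is_easy_to_read(
    sentence: str,
    words: Counter,
    bigrams: Counter,
    min_word_count: int = 2,    # hypothetical threshold, not taken from the paper
    min_bigram_count: int = 1,  # hypothetical threshold, not taken from the paper
) -> bool:
    """Reject a sentence if any of its words or bigrams is infrequent in the reference counts."""
    toks = tokenize(sentence)
    if not toks:
        return False
    if any(words[tok] < min_word_count for tok in toks):
        return False
    return all(bigrams[pair] >= min_bigram_count for pair in zip(toks, toks[1:]))


def filter_candidates(
    candidates: Iterable[str], words: Counter, bigrams: Counter, **thresholds: int
) -> List[str]:
    """Keep only the candidate sentences that pass the frequency-based readability check."""
    return [s for s in candidates if is_easy_to_read(s, words, bigrams, **thresholds)]
```

In a pipeline of this kind, the surviving sentences would form the candidate set from which a coverage-driven selection then picks the final recording script; suitable threshold values would have to be tuned on the actual corpus.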
Similar resources
Considering readability in text-to-speech recording script design
Designing text scripts that cover enough phonetic units and prosodic phenomena is very important when recording speech database for corpus based speech synthesis. When designing recording scripts for speech synthesis databases, a lot of effort is often placed on how to achieve maximal coverage of phonetic units in minimal speech recording. With such methods, sentences with difficult words or in...
Expressive prosody for unit-selection speech synthesis
Current unit selection speech synthesis voices cannot produce emphasis or interrogative contours because of a lack of the necessary prosodic variation in the recorded speech database. A method of recording script design is proposed which addresses this shortcoming. Appropriate components were added to the target cost function of the Festival Multisyn engine, and a perceptual evaluation showed a...
Statistical Parametric Speech Synthesis of Malay Language using Found Training Data
Preparing training data for statistical parametric speech synthesis can be complicated. To ensure good-quality synthetic speech, high-quality, low-noise recordings must be prepared. Preparing the recording script can also be laborious, from word collection and word selection to sentence design. It requires tremendous human effort and takes a lot of time. In this study, we ...
Generating Script Using Statis the Context Variation
A statistical selection method is proposed for generating an optimized recording script for Concatenative Speech Synthesizer. This method starts with traveling a large text corpus to collect the statistical information of the Context Variation Unit Vectors (CVUV), which represent the multi-dimension phonetic contexts and properties of the synthesis unit. Each CVUV descriptor is organized as a n...
Journal title:
- Int. J. of Asian Lang. Proc.
Volume 19
Publication date: 2009